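The standard approach to contrastive learning is to maximize the agreement between different views of the data. The views are ordered in pairs, such that they are either positive, encoding different views of the same object, or negative, corresponding to views of different objects. The supervisory signal comes from maximizing the total similarity over positive pairs, while the negative pairs are needed to avoid collapse. In this work, we note that the approach of considering individual pairs cannot account for both intra-set and inter-set similarities when the sets are formed from the views of the data. It thus limits the information content of the supervisory signal available to train representations. We propose to go beyond contrasting individual pairs of objects by contrasting objects as sets. For this, we use combinatorial quadratic assignment theory, designed to evaluate set and graph similarities, and derive a set-contrastive objective as a regularizer for contrastive learning methods. We conduct experiments and demonstrate that our method improves the learned representations for the tasks of metric learning and self-supervised classification.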
translated by Google Translate
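The pairwise baseline that the abstract above describes can be sketched as follows. This is a minimal, dependency-free illustration of an InfoNCE-style loss over individual pairs, not the set-contrastive regularizer the authors propose. The function names, the cosine similarity, and the `temperature` value are assumptions made for the example, and the input vectors are assumed to be nonzero.

```python
import math


def cosine(u, v):
    """Cosine similarity between two nonzero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(views_a, views_b, temperature=0.5):
    """Mean InfoNCE-style loss over individual pairs.

    views_a[i] and views_b[i] are two views of the same object (positive pair);
    views_a[i] and views_b[j] with j != i are treated as negative pairs.
    """
    n = len(views_a)
    total = 0.0
    for i in range(n):
        logits = [cosine(views_a[i], views_b[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over the positive and all negatives.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of picking the positive among all candidates.
        total += log_denominator - logits[i]
    return total / n
```

Each term in this loss looks only at one anchor and its candidate pairs, which is the limitation the abstract points to: similarities within and between whole sets of views do not enter the objective.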
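The goal of this paper is human-object interaction (HO-I) detection. HO-I detection aims to find interacting human-object pairs in an image and classify their interactions. Researchers have obtained significant improvements in recent years by relying on strong HO-I alignment supervision from [5]. HO-I alignment supervision pairs humans with the objects they interact with, and then aligns each human-object pair with its interaction category. Since collecting such annotation is expensive, in this paper we propose to detect HO-I without alignment supervision. We instead rely on image-level supervision, which only enumerates the interactions present in the image without pointing to where they happen. Our paper makes three contributions: i) we propose Align-Former, a visual-transformer-based CNN that can detect HO-I with only image-level supervision; ii) Align-Former is equipped with an HO-I align layer that can learn to select appropriate targets to allow detector supervision; iii) we evaluate Align-Former on HICO-DET [5] and V-COCO [13], and show that Align-Former outperforms existing image-level supervised HO-I detectors by a large margin (a 4.71% mAP improvement, from 16.14% to 20.85%, on HICO-DET [5]).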
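Robustness against unwanted perturbations is an important aspect of deploying neural network classifiers in the real world. Common natural perturbations include noise, saturation, occlusion, viewpoint changes, and blur deformations. All of them can be modelled by the newly proposed transform-augmented convolutional networks. While many approaches train the network by providing it with augmented data, we aim to integrate perturbations into the network architecture to achieve improved and more general robustness. To demonstrate that wiggling the weights consistently improves classification, we choose a standard network and modify it into a transform-augmented network. On perturbed CIFAR-10 images, the modified network delivers better performance than the original network. For the much smaller STL-10 dataset, in addition to delivering better general robustness, wiggling even improves the classification of unperturbed, clean images. We conclude that wiggled transform-augmented networks acquire good robustness even for perturbations not seen during training.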
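We consider the problem of information compression from high-dimensional data. Whereas many studies consider the problem of compression by non-invertible transformations, we emphasize the importance of invertible compression. We introduce a new class of likelihood-based autoencoders with a pseudo-bijective architecture, which we call Pseudo Invertible Encoders. We provide a theoretical explanation of their principles. We evaluate the Gaussian Pseudo Invertible Encoder on MNIST, where our model outperforms WAE and VAE in the sharpness of the generated images.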
A "heart attack", or myocardial infarction (MI), occurs when an artery supplying blood to the heart is abruptly occluded. The "gold standard" method for imaging MI is Cardiovascular Magnetic Resonance Imaging (MRI) with intravenously administered gadolinium-based contrast (late gadolinium enhancement). However, no "gold standard" fully automated method for the quantification of MI exists. In this work, we propose an end-to-end fully automatic system (MyI-Net) for the detection and quantification of MI in MRI images. This has the potential to reduce the uncertainty due to the technical variability across labs and inherent problems of the data and labels. Our system consists of four processing stages designed to maintain the flow of information across scales. First, features from raw MRI images are generated using feature extractors built on ResNet and MobileNet architectures. This is followed by Atrous Spatial Pyramid Pooling (ASPP) to produce spatial information at different scales and preserve more image context. High-level features from ASPP and initial low-level features are concatenated at the third stage and then passed to the fourth stage, where spatial information is recovered via up-sampling to produce the final image segmentation output into: i) background, ii) heart muscle, iii) blood and iv) scar areas. New models were compared with state-of-the-art models and manual quantification. Our models showed favorable performance in global segmentation and scar tissue detection relative to state-of-the-art work, including a four-fold better performance in matching scar pixels to contours produced by clinicians.
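The multi-scale idea behind the ASPP stage can be illustrated with a small sketch. This is a deliberately simplified 1-D version under stated assumptions: one shared, odd-length kernel is reused at every dilation rate, whereas a real ASPP module would normally use separate learned 2-D convolutions per branch. The function names and the rates `(1, 2, 4)` are hypothetical and are not taken from MyI-Net.

```python
def atrous_conv1d(signal, kernel, rate):
    """Zero-padded, same-length 1-D convolution with dilation `rate`.

    Assumes an odd-length kernel so the window is centred on each position.
    Kernel taps are spaced `rate` samples apart, which widens the receptive
    field without adding weights.
    """
    half = (len(kernel) // 2) * rate
    padded = [0.0] * half + list(signal) + [0.0] * half
    out = []
    for i in range(len(signal)):
        out.append(sum(w * padded[i + k * rate] for k, w in enumerate(kernel)))
    return out


def aspp_1d(signal, kernel, rates=(1, 2, 4)):
    """Run the kernel at several dilation rates and stack the results.

    Returns one feature vector per position, with one entry per rate,
    mimicking the concatenation of parallel atrous branches.
    """
    branches = [atrous_conv1d(signal, kernel, r) for r in rates]
    return [list(features) for features in zip(*branches)]
```

For an impulse input, a rate of 1 responds only to immediate neighbours while a rate of 2 responds to samples two positions away, so stacking the branches gives each position a view of context at several scales.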
Neuromorphic systems require user-friendly software to support the design and optimization of experiments. In this work, we address this need by presenting our development of a machine learning-based modeling framework for the BrainScaleS-2 neuromorphic system. This work represents an improvement over previous efforts, which either focused on the matrix-multiplication mode of BrainScaleS-2 or lacked full automation. Our framework, called hxtorch.snn, enables the hardware-in-the-loop training of spiking neural networks within PyTorch, including support for auto differentiation in a fully-automated hardware experiment workflow. In addition, hxtorch.snn facilitates seamless transitions between emulating on hardware and simulating in software. We demonstrate the capabilities of hxtorch.snn on a classification task using the Yin-Yang dataset employing a gradient-based approach with surrogate gradients and densely sampled membrane observations from the BrainScaleS-2 hardware system.
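The surrogate-gradient approach mentioned above can be sketched in plain Python. This is a generic software illustration and does not use or reproduce the hxtorch.snn API. The leaky integrate-and-fire update, the fast-sigmoid surrogate, and all parameter names and values (`tau`, `threshold`, `beta`) are assumptions chosen for the example.

```python
def lif_forward(input_current, tau=10.0, threshold=1.0, dt=1.0):
    """Simulate one leaky integrate-and-fire neuron with forward Euler steps.

    Returns the membrane trace (recorded before any reset) and the spike train.
    """
    v = 0.0
    trace, spikes = [], []
    for current in input_current:
        v += dt / tau * (-v + current)   # leaky integration toward the input
        fired = v >= threshold           # hard threshold: not differentiable
        trace.append(v)
        spikes.append(1 if fired else 0)
        if fired:
            v = 0.0                      # reset the membrane after a spike
    return trace, spikes


def surrogate_grad(v, threshold=1.0, beta=5.0):
    """Fast-sigmoid surrogate for the derivative of the spike threshold.

    Stands in for the true derivative (zero almost everywhere) during the
    backward pass; it peaks at 1.0 when v equals the threshold.
    """
    return 1.0 / (1.0 + beta * abs(v - threshold)) ** 2
```

In a hardware-in-the-loop setting, the membrane trace would come from observations recorded on the neuromorphic system rather than from a simulation loop like this one, and the surrogate would be evaluated on those recorded values.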
ClueWeb22, the newest iteration of the ClueWeb line of datasets, provides 10 billion web pages affiliated with rich information. Its design was influenced by the need for a high-quality, large-scale web corpus to support a range of academic and industry research, for example, in information systems, retrieval-augmented AI systems, and model pretraining. Compared with earlier ClueWeb corpora, the ClueWeb22 corpus is larger, more varied, of higher quality, and aligned with the document distributions in commercial web search. Besides raw HTML, ClueWeb22 includes rich information about the web pages provided by industry-standard document understanding systems, including the visual representation of pages rendered by a web browser, parsed HTML structure information from a neural network parser, and pre-processed cleaned document text to lower the barrier to entry. Many of these signals have been widely used in industry but are available to the research community for the first time at this scale.
Generalization is an important attribute of machine learning models, particularly for those that are to be deployed in a medical context, where unreliable predictions can have real-world consequences. While the failure of models to generalize across datasets is typically attributed to a mismatch in the data distributions, performance gaps are often a consequence of biases in the 'ground-truth' label annotations. This is particularly important in the context of medical image segmentation of pathological structures (e.g. lesions), where the annotation process is much more subjective and is affected by a number of underlying factors, including the annotation protocol, rater education/experience, and clinical aims, among others. In this paper, we show that modeling annotation biases, rather than ignoring them, poses a promising way of accounting for differences in annotation style across datasets. To this end, we propose a generalized conditioning framework to (1) learn and account for different annotation styles across multiple datasets using a single model, (2) identify similar annotation styles across different datasets in order to permit their effective aggregation, and (3) fine-tune a fully trained model to a new annotation style with just a few samples. Next, we present an image-conditioning approach to model annotation styles that correlate with specific image features, potentially enabling detection biases to be more easily identified.
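Every year, millions of brain MRI scans are acquired in hospitals, a figure considerably larger than the size of any research dataset. The ability to analyze such scans could therefore transform neuroimaging research. Yet their potential remains untapped, since no automated algorithm can cope with the high variability in clinical acquisitions (MR contrast, resolution, orientation, etc.). Here we present SynthSeg+, an AI segmentation suite that enables, for the first time, robust analysis of heterogeneous clinical datasets. Specifically, in addition to whole-brain segmentation, SynthSeg+ also performs cortical parcellation, intracranial volume estimation, and automated detection of faulty segmentations (mainly caused by scans of very low quality). We demonstrate SynthSeg+ in seven experiments, including an aging study on 14,000 scans, where it accurately replicates atrophy patterns observed on data of much higher quality. SynthSeg+ is publicly released as a ready-to-use tool to unlock the potential of quantitative morphometry in wide-ranging settings.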
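Text rewriting with differential privacy (DP) provides concrete theoretical guarantees for protecting the privacy of individuals in textual documents. In practice, existing systems may lack the means to validate their privacy claims, leading to problems of transparency and reproducibility. We introduce DP-Rewrite, an open-source framework for differentially private text rewriting, which aims to solve these problems by being modular, extensible, and highly customizable. Our system incorporates a variety of downstream datasets, models, pre-training procedures, and evaluation metrics to provide a flexible way to lead and validate private text rewriting research. To demonstrate our software in practice, we provide a set of experiments as a case study on the ADePT DP text rewriting system, detecting a privacy leak in its pre-training approach. Our system is publicly available, and we hope that it will help the community make DP text rewriting research more accessible and transparent.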